Psicológica (2013), 34, 97-123.


Rasch Analysis for Binary Data with Nonignorable Nonresponses 

Lucio Bertoli-Barsotti (1) & Antonio Punzo (2)

(1) University of Bergamo, Italy; (2) University of Catania, Italy


This paper introduces a two-dimensional Item Response Theory (IRT) 
model to deal with nonignorable nonresponses in tests with dichotomous 
items. One dimension provides information about the omitting behavior, 
while the other dimension is related to the person’s “ability”. The idea of 
embedding an IRT model for missingness into the measurement model is 
not new but, unlike the existing literature, the model presented in
this paper belongs to the Rasch family of models. As a member of the
exponential family, the model offers several advantages, such as the existence
of nontrivial sufficient statistics and the possibility of specifically objective
parameter estimation; feasibility of conditional inference; and goodness-of-fit
analysis via conditional likelihood ratio tests. Maximum likelihood
estimation is discussed, and the applicability of the proposed model is 
illustrated by using a real data set. 


Introduction 

In applications of Item Response Theory (IRT), researchers are often 
faced with the problem of missing data. This problem may be particularly 
critical when the ignorability principle does not hold (Rubin, 1976; Little & 
Rubin, 2002). Missing data are said to be missing at random (MAR; Little 
& Rubin, 2002, p. 12) when the process generating the pattern of 
missingness does not depend on the unobserved data, although it may 
depend on the observed data. When the probability of missingness is
independent of both unobserved and observed data, the missing data 
mechanism is called missing completely at random (MCAR; note that 
MCAR implies MAR). When data are MAR, the missing data mechanism 
may be considered “ignorable” for likelihood-based inferences if the 
parameter of interest and the parameter of the missingness process are 
distinct (Little & Rubin, 2002, pp. 117-120). Roughly speaking, Rubin's
ignorability principle states the conditions under which ignoring the process
that causes the missingness does not generate any systematic errors in the
parameter estimates.

In a broad sense, we may distinguish two main cases of missingness:
the case in which the observations result from an incomplete design of the
test decided by the researcher (planned missingness; e.g. random
incomplete designs, multistage testing designs, targeted testing designs),
and the case in which the missingness may be considered as the effect of a
choice of the respondent (unplanned missingness). When data are missing
by design, randomization usually (for possible exceptions see Eggen & Verhelst, 2011)
ensures that they are at least MAR (Little & Schenker, 1995, p.
43; Schafer, 1997, pp. 20-22 and p. 62; Mislevy & Wu, 1996). In this paper
we mostly consider the latter case, and more specifically the case in which 
a) an item is presented to a person, b) that person has time to consider it 
(thus, we do not refer to “not reached” items), but c) decides, for whatever 
reason, not to respond (Mislevy & Wu, 1996). This choice may be due to
inability to understand, or unwillingness to respond because of embarrassment,
anger, discomfort, or some other reason that may, or may not, depend on the
latent trait to be measured. As is well known, in these cases missingness
is not generally ignorable. A typical situation is represented by items that
are skipped because of the low “proficiency” of the respondent, but other
examples can be mentioned. For instance, consider the problem of
measuring “ability” to perform activities of daily life. With this aim,
Holman and Glas considered the ALDS item bank (see Holman,
Lindeboom, Vermeulen, Glas, & de Haan, 2001, for details), finding that
"patients with a higher proficiency level tended to boost their rating by
failing to respond, while the patients of low proficiency were less inclined
or motivated to impress the nurses" (Holman & Glas, 2005, p. 9). These
intentionally omitted responses represent a case of nonignorable
missingness, because nonresponse depends on the unobserved data; in other
words, the number of items endorsed is correlated with the respondent’s
proficiency level.

In this article, we focus on the situation in which a person latent trait
$\eta$ is measured by a test composed of $k$ dichotomous items. The parameter
may be thought of as the amount of “ability” (i.e. proficiency, but also
agreement, motivation, belief, attitude, capacity, intention, and so on) of
respondent $v$ ($v = 1, \ldots, n$). For ease of exposition, in what follows we refer
to this latent trait as “ability”. Original response categories (e.g. yes/no,
right/wrong, agree/disagree, correct/incorrect, and so on) are supposed to be
recoded as “1” and “0”, defining in this way a binary response variable $X_{vi}$,
with reference to subject $v$ and the item indexed by $i$ ($i = 1, \ldots, k$).
Conditionally on the response being observed, we assume that
$X_{vi}$ follows a Bernoulli distribution depending on the latent trait $\eta$, and
on item parameters, through an IRT model. We shall refer to this model
as the measurement model. The observations are collected into an $n \times k$ dataset
under the basic assumption of conditional independence between
responses given $\eta$ (local independence).

All the models considered in the present paper make it possible to take into
account the presence, if any, of a nonignorable missing data mechanism
due to omitting behavior. An appropriate method to deal with this type of
missingness is to incorporate the mechanism that caused the missingness
into the measurement model. A possible approach consists in using a latent
variable model that is a function of a two-dimensional latent trait $(\xi, \eta)$,
where $\xi$ is related to the response propensity level. This model-based
approach allows missing values to be included in the analysis and, equally
important, it allows information about attitude to be inferred from
nonresponse. From now on, for convenience, we will write $\xi = \theta_1$ and
$\eta = \theta_2$, where $\theta_1$ and $\theta_2$ are the components of a vector-valued latent
variable $\boldsymbol{\theta}$. This approach was introduced by Knott, Albanese and Galbraith
(1990) and Albanese and Knott (1992). Subsequently, the same model (defined
in equation (3) in the next section) was also
studied and applied, among others, by Knott and Tzamourani (1997),
Bartholomew, de Menezes and Tzamourani (1997), O'Muircheartaigh and
Moustaki (1996, 1999) and, in a slightly
more general form, by Moustaki and Knott (2000) and Moustaki and
O'Muircheartaigh (2000, 2002). More
recently, a unified approach to a more general class of models has been
proposed in a seminal paper by Holman and Glas (Holman & Glas, 2005;
see also Glas & Pimentel, 2006, and Pimentel, 2005, Chapter 2).
Comparative studies of alternative models within the family of Holman and
Glas (2005) are also given by Rose, von Davier and Xu (2010). All the
models in that literature are instances of two-dimensional IRT models that,
by construction, do not belong to the exponential family
of distributions. This precludes the use of a conditional
approach to estimation.

Following the above-mentioned model-based approach to the treatment of
nonignorable nonresponses, we introduce a new two-dimensional IRT
model, belonging to the Rasch family of models, for the analysis of
dichotomously scored items in the presence of nonignorable nonresponses,
called the Rasch-Rasch Model (RRM). As a member of the exponential family,
the RRM offers several advantages: 1) existence of nontrivial sufficient
statistics and possibility of specifically objective parameter estimation; 2)
feasibility of conditional inference; 3) known conditions for identifiability
of the model parameters; 4) known necessary and sufficient conditions for
the existence and uniqueness of the Conditional Maximum Likelihood
(CML) estimates; 5) existence of a conditional likelihood ratio test for
goodness of fit.

This article is organized as follows. The next section contains a brief 
review of the existing models. The third section contains the formulation of 
the proposed model. Then, minimal sufficient statistics for the model 
parameters are presented. In the subsequent section the problem of the 
estimation is investigated. In the sixth section a small simulation study is 
carried out to evaluate parameter recovery. A case study is finally 
considered in the seventh section. 

A standard non-Rasch approach to the problem 

Let $D_{vi}$ be the random response indicator of $X_{vi}$, taking the
value 1 if the response of person $v$ to item $i$ was observed, and 0 otherwise.
If $D_{vi} = 1$, the response variable $X_{vi}$ may assume the values 0 and 1.
Otherwise $X_{vi} = c$, where $c$ is an arbitrary constant.

According to the model-based approach for the treatment of
nonignorable nonresponses, the underlying process leading to the values of
$X_{vi}$ can be explained hierarchically, in two steps. At step 1,
when subject $v$ encounters item $i$, she/he can choose to answer or not. This
step is described by the random variable $D_{vi}$. At step 2, if subject $v$ has
chosen to respond (i.e. $D_{vi} = 1$), then she/he can respond in the category
coded as “0” (i.e. $X_{vi} = 0$) or in the category coded as “1” (i.e. $X_{vi} = 1$)
depending on her/his proficiency level $\eta = \theta_2$. This step is governed by the
random variable $X_{vi} \mid D_{vi} = 1$, taking values 0 and 1. Summarizing, there are
three possible response patterns for any item, say

$$A = (D = 0, X = c), \quad B = (D = 1, X = 0) \quad \text{and} \quad C = (D = 1, X = 1),$$

where the subscripts $v$ and $i$ have been omitted, for brevity. We will refer to
these three patterns as “options” to distinguish them from the original
dichotomous response categories coded as “0” and “1”. Then one can
postulate the following model

$$P(A) = P(D = 0), \quad P(B) = P(X = 0 \mid D = 1)\,P(D = 1), \quad P(C) = P(X = 1 \mid D = 1)\,P(D = 1).$$

According to the described process, the probability of any response
pattern $(D_{vi} = d_{vi}, X_{vi} = x_{vi})$ can be given explicitly as

$$P(D_{vi} = d_{vi}, X_{vi} = x_{vi}) = P(D_{vi} = d_{vi}) \left[ P(X_{vi} = x_{vi} \mid D_{vi} = 1) \right]^{d_{vi}}. \qquad (1)$$

Let us denote by $\pi_{viA}$, $\pi_{viB}$ and $\pi_{viC}$, respectively, the probabilities of
the events $(D_{vi} = 0, X_{vi} = c)$, $(D_{vi} = 1, X_{vi} = 0)$ and $(D_{vi} = 1, X_{vi} = 1)$. A
possible IRT model for each of the terms on the right-hand side of equation
(1) is represented by the two-parameter logistic (2PL; Birnbaum, 1968)
model, which leads to

$$\ln \frac{\pi_{viB} + \pi_{viC}}{\pi_{viA}} = \alpha_{1i}\theta_{1v} - \delta_{1i}, \qquad \ln \frac{\pi_{viC}}{\pi_{viB}} = \alpha_{2i}\theta_{2v} - \delta_{2i}, \qquad (2)$$

where $\alpha_{1i}$ and $\alpha_{2i}$ are item discrimination parameters and the deltas are
item parameters.

In order to obtain parameter estimates, one could consider Marginal
Maximum Likelihood (MML), in which person parameters are integrated
out by marginalization, assuming a two-dimensional normal distribution,
with density $g(\boldsymbol{\theta}) = g(\boldsymbol{\theta} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})$, for the latent variable $\boldsymbol{\theta}$, with mean vector
$\boldsymbol{\mu} = \mathbf{0}$ (corresponding to an identifiability constraint) and unknown
covariance matrix $\boldsymbol{\Sigma}$. This model is called $G_2$ in the taxonomy of Holman
and Glas (2005); they also perform a simulation study by considering $G_2$ in
its simplest form, given by $\alpha_{1i} = \alpha_{2i} = 1$. Interestingly, in this case, although
each logit equation in (2) represents a Rasch Model (RM; Rasch, 1960) for
dichotomous items, the whole model (1) does not belong to the Rasch
family of models.

In order to provide possibly more meaningful parameters, the model
given in (2) can also be reparameterized by writing $\boldsymbol{\alpha}_{1i}'\boldsymbol{\theta}_v$ instead of $\alpha_{1i}\theta_{1v}$
and $\boldsymbol{\alpha}_{2i}'\boldsymbol{\theta}_v$ instead of $\alpha_{2i}\theta_{2v}$, where $\boldsymbol{\alpha}_{1i} = (\alpha_{11i}, \alpha_{12i})'$ and $\boldsymbol{\alpha}_{2i} = (\alpha_{21i}, \alpha_{22i})'$ are
vectors of item discrimination parameters. This means that the variables $D_{vi}$
and/or $X_{vi} \mid D_{vi}$ may depend (possibly) on both the components $\theta_1$ and $\theta_2$
of the latent trait $\boldsymbol{\theta}$. In that way, we obtain the following more general
version of the previous model (2)

$$\ln \frac{\pi_{viB} + \pi_{viC}}{\pi_{viA}} = \alpha_{11i}\theta_{1v} + \alpha_{12i}\theta_{2v} - \delta_{1i}, \qquad \ln \frac{\pi_{viC}}{\pi_{viB}} = \alpha_{21i}\theta_{1v} + \alpha_{22i}\theta_{2v} - \delta_{2i},$$

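where it may be noted that both these logit equations are now instances of a
multidimensional 2PL (M2PL) model for dichotomous items (Reckase,
2009, p. 86). According to these logit equations, the response probability
function can be expressed explicitly as follows

$$\begin{aligned}
\pi_{viA} &= \left[1 + \exp(\boldsymbol{\alpha}_{1i}'\boldsymbol{\theta}_v - \delta_{1i})\right]^{-1}, \\
\pi_{viB} &= \exp(\boldsymbol{\alpha}_{1i}'\boldsymbol{\theta}_v - \delta_{1i}) \left\{ \left[1 + \exp(\boldsymbol{\alpha}_{1i}'\boldsymbol{\theta}_v - \delta_{1i})\right] \left[1 + \exp(\boldsymbol{\alpha}_{2i}'\boldsymbol{\theta}_v - \delta_{2i})\right] \right\}^{-1}, \\
\pi_{viC} &= \exp(\boldsymbol{\alpha}_{1i}'\boldsymbol{\theta}_v + \boldsymbol{\alpha}_{2i}'\boldsymbol{\theta}_v - \delta_{1i} - \delta_{2i}) \left\{ \left[1 + \exp(\boldsymbol{\alpha}_{1i}'\boldsymbol{\theta}_v - \delta_{1i})\right] \left[1 + \exp(\boldsymbol{\alpha}_{2i}'\boldsymbol{\theta}_v - \delta_{2i})\right] \right\}^{-1}.
\end{aligned} \qquad (3)$$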
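The two-step process behind equation (3) can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the function names `logistic` and `option_probs_m2pl` and all numeric values are hypothetical and do not come from the paper. It builds the three option probabilities from the response-propensity step and the conditional response step, and recovers the two logits.

```python
import math


def logistic(z):
    """Standard logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def option_probs_m2pl(theta, alpha1, alpha2, delta1, delta2):
    """Probabilities (pi_A, pi_B, pi_C) for one person-item pair under (3).

    theta  -- (theta_1, theta_2), components of the latent trait
    alpha1 -- (alpha_11i, alpha_12i), discriminations for the indicator D
    alpha2 -- (alpha_21i, alpha_22i), discriminations for X given D = 1
    """
    # Step 1: probability that the person responds at all (D = 1).
    p_respond = logistic(alpha1[0] * theta[0] + alpha1[1] * theta[1] - delta1)
    # Step 2: probability of a "1" given that a response was observed.
    p_one = logistic(alpha2[0] * theta[0] + alpha2[1] * theta[1] - delta2)
    pi_a = 1.0 - p_respond
    pi_b = p_respond * (1.0 - p_one)
    pi_c = p_respond * p_one
    return pi_a, pi_b, pi_c


# Illustrative values only (not taken from the paper).
pi_a, pi_b, pi_c = option_probs_m2pl(
    theta=(0.5, -0.3), alpha1=(1.2, 0.4), alpha2=(0.0, 0.9),
    delta1=-1.0, delta2=0.2,
)
# The two logits return the linear predictors of the two steps.
logit_respond = math.log((pi_b + pi_c) / pi_a)  # 1.2*0.5 + 0.4*(-0.3) + 1.0
logit_one = math.log(pi_c / pi_b)               # 0.9*(-0.3) - 0.2
```

Because the probability of responding multiplies both $\pi_{viB}$ and $\pi_{viC}$, the second logit depends only on the second linear predictor.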



To identify this model, we need to impose some constraints on the
discrimination parameters $\boldsymbol{\alpha}_{1i}$ and $\boldsymbol{\alpha}_{2i}$. In particular, by taking $\alpha_{21i} = 0$, we
obtain a model studied by Knott and Tzamourani (1997) (see also
O’Muircheartaigh & Moustaki, 1999; Knott et al., 1990; Albanese & Knott,
1992), that is:

$$\ln \frac{\pi_{viB} + \pi_{viC}}{\pi_{viA}} = \alpha_{11i}\theta_{1v} + \alpha_{12i}\theta_{2v} - \delta_{1i}, \qquad \ln \frac{\pi_{viC}}{\pi_{viB}} = \alpha_{22i}\theta_{2v} - \delta_{2i}. \qquad (4)$$


Model (4) has also been studied by Rose et al. (2010; referred to as
“Model 7”). Holman and Glas (2005) denoted this model as $G_3$, while the
case $\alpha_{12i} = 0$ is referred to as model $G_4$ and model $G_2$ is obtained by taking
$\alpha_{21i} = \alpha_{12i} = 0$. It is important to realize that in all these models item and
person parameters generally have quite different meanings; in particular, the
correlation between the components of the latent variable $\boldsymbol{\theta}$ will differ
across these models. It should be noted that for technical reasons, in the
general case of models $G_3$ and $G_4$, the covariance matrix $\boldsymbol{\Sigma}$ should be
constrained to be an identity matrix to allow the model to be identified.


A Rasch approach to the problem 
Formulation of the model 

A well-known drawback of the MML approach is that, if the
distributional assumptions about $\boldsymbol{\theta}$ are wrong, the method is not consistent
(Pfanzagl, 1994). As an alternative to the MML procedure one could also
consider the Conditional ML (CML) method for item parameter estimation.
The advantage is that CML does not require distributional assumptions
about $\boldsymbol{\theta}$. However, this method essentially requires that the model belong
to the exponential family. To obtain a model that belongs to this
family, in this paper we suggest considering the following alternative
formulation of model (2)

$$\ln \frac{\pi_{viB}}{\pi_{viA}} = \theta_{1v} - \delta_{1i}, \qquad \ln \frac{\pi_{viC}}{\pi_{viB}} = \theta_{2v} - \delta_{2i}. \qquad (5)$$


Since each logit equation in (5) assumes the standard form of an RM,
and the whole two-dimensional model belongs to the Rasch family of
models, (5) is called the Rasch-Rasch Model (RRM). In its simplest form (but
see below for generalizations), this model can be written succinctly as

$$P(D_{vi} = d_{vi}, X_{vi} = x_{vi}) = \frac{\exp\left[ d_{vi}(\theta_{1v} - \delta_{1i}) + d_{vi}x_{vi}(\theta_{2v} - \delta_{2i}) \right]}{1 + \exp(\theta_{1v} - \delta_{1i}) + \exp(\theta_{1v} + \theta_{2v} - \delta_{1i} - \delta_{2i})}.$$
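As a quick numerical check of the RRM in its simplest form, the sketch below computes the three option probabilities for one person-item pair and verifies that the adjacent-option logits of (5) are recovered. The function name `rrm_option_probs` and the parameter values are illustrative assumptions, not part of the paper.

```python
import math


def rrm_option_probs(theta1, theta2, delta1, delta2):
    """Probabilities (pi_A, pi_B, pi_C) under the RRM in its simplest form."""
    weight_a = 1.0                                           # option A: nonresponse
    weight_b = math.exp(theta1 - delta1)                     # option B: response "0"
    weight_c = math.exp(theta1 + theta2 - delta1 - delta2)   # option C: response "1"
    total = weight_a + weight_b + weight_c
    return weight_a / total, weight_b / total, weight_c / total


# Illustrative values only (not taken from the paper).
pi_a, pi_b, pi_c = rrm_option_probs(theta1=0.8, theta2=-0.4, delta1=0.3, delta2=0.1)
logit_b_vs_a = math.log(pi_b / pi_a)  # equals theta1 - delta1
logit_c_vs_b = math.log(pi_c / pi_b)  # equals theta2 - delta2
```

Unlike the hierarchical model of the previous section, all three options share a single normalizing denominator, which is what places the model in the exponential family.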


From a comparison between the models given in (5) and in (2), it is
straightforward to note that, while the person and item parameters $\theta_2$ and $\delta_2$
maintain a similar interpretation in both models, as “ability” and
“difficulty”, the first logit refers to the distinction B versus A
in the RRM, instead of (B or C) versus A in the model given in (2). As a
consequence, the person and item parameters $\theta_1$ and $\delta_1$ have different meanings
in the two models; for example, in model (5) the item
parameter $\delta_1$ indicates the relative “difficulty” of choosing option B rather
than A. This “slight” modification in the parameterization has the great
advantage of placing the RRM in the exponential family while preserving, although
with a different interpretation of the parameters, the idea of embedding a
model for missingness in the measurement model. Obviously, sometimes
other comparisons between response options may be of interest (e.g. C
versus A). In these cases, alternative parameterizations of the RRM may
also be adopted.


Alternative parameterizations and relationship with other families of 
IRT models 

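There are several types of alternative RRM parameterizations to
choose from. Indeed, in analogy with the previous section, the model given
in (5) can also be reparameterized by writing

$$\boldsymbol{a}_1'\boldsymbol{\theta}_v \text{ instead of } \theta_{1v} \quad \text{and} \quad \boldsymbol{a}_2'\boldsymbol{\theta}_v \text{ instead of } \theta_{2v}, \qquad (6)$$

where $\boldsymbol{a}_1 = (a_{11}, a_{12})'$ and $\boldsymbol{a}_2 = (a_{21}, a_{22})'$ are vectors of known and fixed
coefficients of the trait parameters (rather than unknown discrimination
parameters, as in the models of the previous section). According to the
above reparameterization, the response probability function can be
expressed explicitly as follows

$$\begin{aligned}
\pi_{viA} &= \left[1 + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v - \delta_{1i}) + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v + \boldsymbol{a}_2'\boldsymbol{\theta}_v - \delta_{1i} - \delta_{2i})\right]^{-1}, \\
\pi_{viB} &= \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v - \delta_{1i}) \left[1 + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v - \delta_{1i}) + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v + \boldsymbol{a}_2'\boldsymbol{\theta}_v - \delta_{1i} - \delta_{2i})\right]^{-1}, \\
\pi_{viC} &= \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v + \boldsymbol{a}_2'\boldsymbol{\theta}_v - \delta_{1i} - \delta_{2i}) \left[1 + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v - \delta_{1i}) + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v + \boldsymbol{a}_2'\boldsymbol{\theta}_v - \delta_{1i} - \delta_{2i})\right]^{-1},
\end{aligned} \qquad (7)$$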

where it is understood that the vectors $\boldsymbol{a}_1$ and $\boldsymbol{a}_2$ have to be linearly
independent to identify the model parameters. Note that since these
constants are supposed to be known, the model remains within the Rasch
family of models; in particular, the choice $\boldsymbol{a}_1 = (1, 0)'$ and $\boldsymbol{a}_2 = (0, 1)'$ yields
the RRM in its simplest form, given in (5). Although all these models are
identical up to the reparameterization (6), the meaning of the latent variables $\theta_1$
and $\theta_2$ may change.

By comparing the probability functions in (3) and (7), it is apparent 
that the models are similar, but inherently different. In particular, it is worth 
noting that the RRM cannot be seen in any way as a special case of the class 
of models described in the second section. However, the RRM may be 
related to other known families of IRT models. More specifically, if A is 
recoded as “0”, B is recoded as “1”, and C is recoded as “2”, the RRM (5) 
can also be viewed as an instance of the Multidimensional Polytomous 
Latent Trait (MPLT) model (Kelderman & Rijkes, 1994), although it should be
noted that the three response options A, B and C are not ordered in the
RRM framework. Incidentally, if we assume the
unidimensionality of the latent trait, that is $\theta_1 = \theta_2$, the same recoding
generates the well-known Partial Credit Model (PCM; Masters, 1982).

Since there is no total order among the options A, B and C, the RRM
can also be considered as a special case of a Multidimensional Nominal
Response Model (MNRM; Bolt & Johnson, 2009) where some of the
parameters are constrained. Indeed, choosing option B as a reference point
for every item, the MNRM can also be written as

$$\pi_{viA} \propto \exp(\boldsymbol{b}_{1i}'\boldsymbol{\theta}_v + c_{1i}), \qquad \pi_{viC} \propto \exp(\boldsymbol{b}_{2i}'\boldsymbol{\theta}_v + c_{2i}),$$

where $\boldsymbol{b}_{1i}$ and $\boldsymbol{b}_{2i}$ are vectors of discrimination parameters, and $c_{1i}$ and $c_{2i}$
are intercept parameters (Thissen, Cai & Bock, 2010). Since the RRM (5)
yields in particular the probabilities

$$\pi_{viA} = \frac{\exp[-(\theta_{1v} - \delta_{1i})]}{1 + \exp[-(\theta_{1v} - \delta_{1i})] + \exp(\theta_{2v} - \delta_{2i})}, \qquad \pi_{viC} = \frac{\exp(\theta_{2v} - \delta_{2i})}{1 + \exp[-(\theta_{1v} - \delta_{1i})] + \exp(\theta_{2v} - \delta_{2i})},$$

the model is equivalent to the MNRM by imposing the following
constraints

$$\boldsymbol{b}_{1i} = (-1, 0)' \quad \text{and} \quad \boldsymbol{b}_{2i} = (0, 1)'$$

for every item $i$. With the same parameterization, the RRM can also be
viewed as an instance of a Multicategorical Multidimensional Rasch Model
(Rasch, 1961; Andersen, 1973). Finally, by construction, the RRM may be
seen (if the MML approach to estimation is adopted) as an instance of a
Multidimensional Random Coefficients Multinomial Logit Model
(MRCMLM; Adams et al., 1997), which is the most general structure of a
multidimensional Rasch model. It may also be noted that, since in our case
each of the items relates to more than one dimension ($\theta_1$ and $\theta_2$), the model
can be regarded as a within-item multidimensional IRT model. To be
precise about the specific kind of within-item multidimensionality, we may
observe that in the RRM, for every item, the response process for each
single option requires different latent traits.
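The stated equivalence with the constrained nominal model can be checked numerically. The sketch below is an illustration only: the function names and parameter values are hypothetical, and the intercepts $c_{1i} = \delta_{1i}$ and $c_{2i} = -\delta_{2i}$ are not quoted from the text but obtained here by matching the exponents term by term.

```python
import math


def rrm_probs(theta, delta1, delta2):
    """(pi_A, pi_B, pi_C) under the RRM (5); theta = (theta_1, theta_2)."""
    weights = [
        1.0,
        math.exp(theta[0] - delta1),
        math.exp(theta[0] + theta[1] - delta1 - delta2),
    ]
    total = sum(weights)
    return [w / total for w in weights]


def mnrm_probs(theta, b1, c1, b2, c2):
    """(pi_A, pi_B, pi_C) under a nominal model with option B as reference."""
    w_a = math.exp(b1[0] * theta[0] + b1[1] * theta[1] + c1)
    w_c = math.exp(b2[0] * theta[0] + b2[1] * theta[1] + c2)
    total = w_a + 1.0 + w_c
    return [w_a / total, 1.0 / total, w_c / total]


theta, delta1, delta2 = (0.7, -1.1), 0.4, -0.6  # illustrative values
p_rrm = rrm_probs(theta, delta1, delta2)
# Slopes are the constraints from the text; intercepts are derived by
# matching exponents: c1 = delta1 and c2 = -delta2 (an assumption of this sketch).
p_mnrm = mnrm_probs(theta, b1=(-1.0, 0.0), c1=delta1, b2=(0.0, 1.0), c2=-delta2)
max_gap = max(abs(p - q) for p, q in zip(p_rrm, p_mnrm))
```

The two parameterizations differ only by the common factor $\exp(\theta_{1v} - \delta_{1i})$ in numerator and denominator, so `max_gap` is zero up to rounding.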

Minimal sufficient statistics for the RRM parameters 

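Given an $n \times k$ dataset, under the usual assumption of local
independence, the log-likelihood function for the RRM in its most general
form, with reparameterization (6), is

$$l = \sum_{v=1}^{n} \left[ (a_{11}d_{v\cdot} + a_{21}x_{v\cdot})\theta_{1v} + (a_{12}d_{v\cdot} + a_{22}x_{v\cdot})\theta_{2v} \right] - \sum_{i=1}^{k} (d_{\cdot i}\delta_{1i} + x_{\cdot i}\delta_{2i}) + C, \qquad (8)$$

where

$$C = -\sum_{v=1}^{n}\sum_{i=1}^{k} \ln\left[1 + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v - \delta_{1i}) + \exp(\boldsymbol{a}_1'\boldsymbol{\theta}_v + \boldsymbol{a}_2'\boldsymbol{\theta}_v - \delta_{1i} - \delta_{2i})\right]$$

and where

$$d_{v\cdot} = \sum_{i=1}^{k} d_{vi}, \quad x_{v\cdot} = \sum_{i=1}^{k} d_{vi}x_{vi}, \quad d_{\cdot i} = \sum_{v=1}^{n} d_{vi}, \quad x_{\cdot i} = \sum_{v=1}^{n} d_{vi}x_{vi}.$$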

In particular, if $\boldsymbol{a}_1$ and $\boldsymbol{a}_2$ are linearly independent,
$(a_{11}d_{v\cdot} + a_{21}x_{v\cdot},\ a_{12}d_{v\cdot} + a_{22}x_{v\cdot})$ or, equivalently, $(d_{v\cdot}, x_{v\cdot})$, and $(d_{\cdot i}, x_{\cdot i})$ are
the joint minimal sufficient statistics for the parameters $(\theta_{1v}, \theta_{2v})$ and
$(\delta_{1i}, \delta_{2i})$, respectively. Note that, by definition, the totals $d_{v\cdot}$ and $d_{\cdot i}$
represent the number of given answers by row and by column, respectively.
Similarly, the totals $x_{v\cdot}$ and $x_{\cdot i}$ represent the number of responses in the
category coded as “1” by row and by column, respectively.
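The four sets of totals are simple counts over the data table. The sketch below assumes, purely for illustration, that the data are stored as an $n \times k$ list of option labels "A", "B", "C"; the function name `sufficient_statistics` and the toy data are hypothetical.

```python
def sufficient_statistics(options):
    """Row and column totals from an n x k table of options.

    options[v][i] is "A" (nonresponse), "B" (response "0") or "C" (response "1").
    Returns (d_row, x_row, d_col, x_col).
    """
    n, k = len(options), len(options[0])
    answered = [[1 if o in ("B", "C") else 0 for o in row] for row in options]
    ones = [[1 if o == "C" else 0 for o in row] for row in options]
    d_row = [sum(answered[v]) for v in range(n)]                     # answers per person
    x_row = [sum(ones[v]) for v in range(n)]                         # "1" responses per person
    d_col = [sum(answered[v][i] for v in range(n)) for i in range(k)]  # answers per item
    x_col = [sum(ones[v][i] for v in range(n)) for i in range(k)]      # "1" responses per item
    return d_row, x_row, d_col, x_col


toy = [["C", "B", "A"],
       ["B", "B", "C"],
       ["A", "A", "C"]]
d_row, x_row, d_col, x_col = sufficient_statistics(toy)
# d_row == [2, 3, 1], x_row == [1, 1, 1], d_col == [2, 2, 2], x_col == [1, 0, 2]
```

A nonresponse contributes to neither total, so the arbitrary code $c$ never enters the statistics.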


Model estimation 


Conditional maximum likelihood estimation approach 

As mentioned above, the MML method constitutes the standard estimation
technique for the models presented in this paper. Nevertheless, for the
RRM, we can further consider the CML method. In the CML approach,
which is the natural approach in the exponential family framework
(Lehmann, 1983), the person parameters $\theta_{1v}$ and $\theta_{2v}$, considered as nuisance
parameters, are eliminated by conditioning on their sufficient statistics $d_{v\cdot}$
and $x_{v\cdot}$. In order to estimate the item parameters of the RRM (5), one can
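maximize the conditional log-likelihood function

$$l_c(\boldsymbol{\delta}_1, \boldsymbol{\delta}_2) = -\sum_{i=1}^{k} (d_{\cdot i}\delta_{1i} + x_{\cdot i}\delta_{2i}) - \sum_{v=1}^{n} \ln \gamma(d_{v\cdot}, x_{v\cdot} \mid \boldsymbol{\delta}_1, \boldsymbol{\delta}_2),$$

where

$$\gamma(d_{v\cdot}, x_{v\cdot} \mid \boldsymbol{\delta}_1, \boldsymbol{\delta}_2) = \sum_{(\boldsymbol{d}^*, \boldsymbol{x}^*)} \exp\left[-\sum_{i=1}^{k} (d_i^*\delta_{1i} + x_i^*\delta_{2i})\right],$$

with the summation running across all answer patterns
$\boldsymbol{x}^* = (x_1^*, x_2^*, \ldots, x_k^*)$, coupled with the response indicator vector
$\boldsymbol{d}^* = (d_1^*, d_2^*, \ldots, d_k^*)$, such that $\sum_i d_i^* = d_{v\cdot}$ and $\sum_i x_i^* = x_{v\cdot}$. Then,
$\gamma(r, t \mid \boldsymbol{\delta}_1, \boldsymbol{\delta}_2)$ denotes a special case of “elementary symmetric function” of
order $(r, t)$ (where, by construction, $r \geq t$) of the parameters $\delta_{11}, \ldots, \delta_{1k}$
and $\delta_{21}, \ldots, \delta_{2k}$. For example, consider a test with $k = 4$ items; for the
case of $d_{v\cdot} = 3$ and $x_{v\cdot} = 2$ we find the elementary symmetric function

$$\begin{aligned}
\gamma(3, 2) = {} & \eta_1\eta_2\tau_3 + \eta_1\eta_3\tau_2 + \eta_2\eta_3\tau_1 + \eta_1\eta_2\tau_4 + \eta_1\eta_4\tau_2 + \eta_2\eta_4\tau_1 \\
& + \eta_1\eta_3\tau_4 + \eta_1\eta_4\tau_3 + \eta_3\eta_4\tau_1 + \eta_2\eta_3\tau_4 + \eta_2\eta_4\tau_3 + \eta_3\eta_4\tau_2,
\end{aligned}$$

where $\tau_i = \exp(-\delta_{1i})$ and $\eta_i = \exp(-\delta_{1i} - \delta_{2i})$.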
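To make the elementary symmetric functions concrete, the following sketch computes $\gamma(r, t)$ in two ways: by brute-force enumeration of all $3^k$ option patterns, and by a one-item-at-a-time recursion. It then assembles the conditional log-likelihood from the sufficient statistics. This is an illustrative implementation under the definitions above, not the authors' code (the paper's own implementation is in Mathematica); all function names are hypothetical.

```python
import itertools
import math


def gamma_bruteforce(r, t, delta1, delta2):
    """gamma(r, t) by enumerating all option patterns over k items."""
    k = len(delta1)
    total = 0.0
    for pattern in itertools.product("ABC", repeat=k):
        d = sum(o != "A" for o in pattern)  # number of given answers
        x = sum(o == "C" for o in pattern)  # number of "1" responses
        if d == r and x == t:
            exponent = sum(
                -(o != "A") * delta1[i] - (o == "C") * delta2[i]
                for i, o in enumerate(pattern)
            )
            total += math.exp(exponent)
    return total


def gamma_table(delta1, delta2):
    """All gamma(r, t), 0 <= t <= r <= k, adding one item at a time."""
    k = len(delta1)
    g = [[0.0] * (k + 1) for _ in range(k + 1)]
    g[0][0] = 1.0
    for i in range(k):
        tau = math.exp(-delta1[i])              # item i answered with "0" (option B)
        eta = math.exp(-delta1[i] - delta2[i])  # item i answered with "1" (option C)
        new = [[0.0] * (k + 1) for _ in range(k + 1)]
        for r in range(i + 2):
            for t in range(r + 1):
                value = g[r][t]                      # item i omitted (option A)
                if r >= 1:
                    value += tau * g[r - 1][t]
                if r >= 1 and t >= 1:
                    value += eta * g[r - 1][t - 1]
                new[r][t] = value
        g = new
    return g


def conditional_loglik(d_row, x_row, d_col, x_col, delta1, delta2):
    """Conditional log-likelihood l_c built from the sufficient statistics."""
    g = gamma_table(delta1, delta2)
    item_part = -sum(
        d * a + x * b for d, x, a, b in zip(d_col, x_col, delta1, delta2)
    )
    person_part = -sum(math.log(g[r][t]) for r, t in zip(d_row, x_row))
    return item_part + person_part


zeros = [0.0, 0.0, 0.0, 0.0]
n_terms = gamma_table(zeros, zeros)[3][2]  # with all deltas at 0, counts the terms of gamma(3, 2)
```

With all item parameters set to zero every term equals one, so `n_terms` counts the twelve products in the $k = 4$ example. The recursion avoids the exponential cost of enumeration, which matters for longer tests.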

CML item parameter estimates are obtained by maximizing $l_c$ under
two identifiability constraints, say $\sum_{i=1}^{k}\delta_{1i} = 0$ and $\sum_{i=1}^{k}\delta_{2i} = 0$. In a second
step, these CML estimates can be substituted, as if they were the “true” item
parameters, into the log-likelihood function (8) to give a profiled version of
$l$ that can be maximized with respect to $\boldsymbol{\theta}$. In this way, both item and
person parameter estimates are obtained.


Marginal maximum likelihood estimation approach 

Obviously, the MML approach is also possible for estimation of the
parameters of an RRM. Recall that this approach is inherently based
on what Holland (1990) called the random sampling rationale, i.e. the
assumption that the subjects are sampled at random from a population with
a specified distribution (e.g. a normal distribution). This method makes
explicit use of the density function $g(\boldsymbol{\theta})$ of the latent variable, in order to
obtain a likelihood function where the person parameters are integrated out.
In other terms, in the random sampling perspective, the parameter $\boldsymbol{\theta}$ is
simply a variable of integration and cannot be estimated. In our case, as
already assumed in the second section, we will use a normal density
$g(\boldsymbol{\theta} \mid \mathbf{0}, \boldsymbol{\Sigma})$, leading to the following marginal log-likelihood function

$$l_M(\boldsymbol{\delta}_1, \boldsymbol{\delta}_2, \boldsymbol{\Sigma}) = \sum_{v=1}^{n} \ln \int \prod_{i=1}^{k} P(d_{vi}, x_{vi} \mid \boldsymbol{\theta}, \delta_{1i}, \delta_{2i})\, g(\boldsymbol{\theta} \mid \mathbf{0}, \boldsymbol{\Sigma})\, d\boldsymbol{\theta},$$

which can be maximized to obtain estimates of the item parameters and of the
unknown covariance matrix $\boldsymbol{\Sigma}$. Estimates of the individual’s
position in the two-dimensional latent space can be obtained with a
procedure similar to that described for the CML approach, or by using the expected a
posteriori (EAP) values given the response patterns. In the perspective of
adopting the MML approach, it is useful to recognize that the RRM is an
instance of an MRCMLM, for a convenient choice of both the design and
the scoring matrices. Operationally, the computer program ConQuest (Wu,
Adams, Wilson & Haldane, 2007), which allows for MML estimation of the
parameters of an MRCMLM, can be used for the RRM too.
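The integral in the marginal log-likelihood has no closed form, but it can be approximated by averaging over random draws of the latent trait. The sketch below is a simplified illustration and not the ConQuest algorithm: it assumes unit variances (so that only the correlation `rho` is free), uses plain Monte Carlo averaging, and all function names are hypothetical.

```python
import itertools
import math
import random


def pattern_prob(pattern, theta, delta1, delta2):
    """RRM probability of a full option pattern given theta = (theta_1, theta_2)."""
    prob = 1.0
    for option, d1, d2 in zip(pattern, delta1, delta2):
        w_b = math.exp(theta[0] - d1)
        w_c = math.exp(theta[0] + theta[1] - d1 - d2)
        weight = {"A": 1.0, "B": w_b, "C": w_c}[option]
        prob *= weight / (1.0 + w_b + w_c)
    return prob


def draw_thetas(n_draws, rho, rng):
    """Bivariate normal draws: zero means, unit variances, correlation rho."""
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return draws


def marginal_loglik(data, delta1, delta2, draws):
    """Monte Carlo approximation of the marginal log-likelihood."""
    total = 0.0
    for pattern in data:
        avg = sum(pattern_prob(pattern, th, delta1, delta2) for th in draws) / len(draws)
        total += math.log(avg)
    return total


# Illustrative check with k = 2 items: the approximate marginal
# probabilities of all 3^2 = 9 patterns add up to one.
rng = random.Random(1)
draws = draw_thetas(200, 0.5, rng)
delta1, delta2 = [0.2, -0.3], [0.1, 0.4]
total_mass = sum(
    math.exp(marginal_loglik([p], delta1, delta2, draws))
    for p in itertools.product("ABC", repeat=2)
)
```

Reusing the same draws for every pattern keeps the approximated probabilities coherent, which is why they sum to one for any number of draws.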

Simulation 

A simulation study is conducted to assess parameter recovery and the
effect of the latent correlation parameter on these estimates. The data are
randomly generated from the RRM, in its form given in (5), and analyzed
using both

a) the generating model;

b) the simple RM (with “difficulty” parameter $\delta^{RM}$) ignoring the
missing data mechanism.

The R software (R Development Core Team, 2008) was used to
generate the data. For every replication, a random sample of size $n = 10000$
of latent trait values $(\theta_1, \theta_2)$ is drawn from a two-dimensional normal
distribution with means zero, variances 1 and correlation $\rho$, with
$\rho = 0.0, 0.2, 0.4, 0.6, 0.8$; two different sets, $\delta_{11}, \delta_{12}, \ldots, \delta_{1k}$ and $\delta_{21}, \delta_{22}, \ldots, \delta_{2k}$, of
$k = 15$ item parameter values are randomly selected from a standard
normal distribution. Then, for each person a response vector is generated.
One hundred replications are made for each fixed value of $\rho$. For each
dataset, we compute the MML item parameter estimates, say $\hat{\delta}_i^{RM}$
for the RM, and the estimates $\hat{\delta}_{1i}$, $\hat{\delta}_{2i}$ and $\hat{\rho}$ for the RRM (note that in this
section we use a different notation to distinguish the “difficulty” parameter
between the two models, i.e. $\delta_i^{RM}$ for the RM and $\delta_{2i}$ for the RRM), using
the calibration program ConQuest. Parameter estimation is done with the
Monte Carlo method, using 2000 nodes and a convergence criterion of
0.0001.
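The data-generating step can be sketched as follows. The original study used R; this Python version is only an illustrative re-expression under the same design (bivariate normal traits with unit variances and correlation `rho`, options drawn from the RRM), and the function name `simulate_rrm` and the small parameter values are hypothetical.

```python
import math
import random


def simulate_rrm(n, delta1, delta2, rho, seed=0):
    """Generate an n x k table of options "A", "B", "C" from the RRM (5)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Correlated standard normal traits via a Cholesky-type construction.
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        theta1 = z1
        theta2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        row = []
        for d1, d2 in zip(delta1, delta2):
            w_b = math.exp(theta1 - d1)
            w_c = math.exp(theta1 + theta2 - d1 - d2)
            # Draw one of the three options with probabilities
            # proportional to (1, w_b, w_c).
            u = rng.random() * (1.0 + w_b + w_c)
            if u < 1.0:
                row.append("A")
            elif u < 1.0 + w_b:
                row.append("B")
            else:
                row.append("C")
        data.append(row)
    return data


# Small illustrative run (the study itself uses n = 10000 and k = 15).
sample = simulate_rrm(n=50, delta1=[0.0, 0.5, -0.5], delta2=[0.3, -0.3, 0.0], rho=0.6, seed=42)
```

An analysis that ignores the missing data mechanism would then treat every "A" as simply absent and fit a unidimensional RM to the remaining "B" and "C" entries.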

The accuracy of the parameter estimates is assessed by means of
the average bias (BIAS), and a sample version of the
root mean squared error (RMSE; i.e. the root of the averaged squared
deviation between a parameter and its estimate) as follows

$$\mathrm{BIAS} = \frac{1}{rk}\sum_{j=1}^{r}\sum_{i=1}^{k}(\hat{\phi}_{ij} - \phi_{ij}), \qquad \mathrm{RMSE} = \sqrt{\frac{1}{rk}\sum_{j=1}^{r}\sum_{i=1}^{k}(\hat{\phi}_{ij} - \phi_{ij})^2},$$

where $r$ is the number of replications, $\hat{\phi}_{ij}$ is the estimated parameter of item
$i$ (i.e. $\hat{\delta}_i^{RM}$ for the RM, and $\hat{\delta}_{1i}$, $\hat{\delta}_{2i}$ for the RRM) in replication $j$, and $\phi_{ij}$
represents the corresponding generating value of that parameter and
replication. Table 1 gives the results for the item parameters, while Table 2
shows the results for the correlation parameter.
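The two accuracy criteria can be computed as in the sketch below. It assumes that the averages run jointly over replications and items, which is one reading of the description above; the function name `bias_and_rmse` and the toy numbers are hypothetical.

```python
import math


def bias_and_rmse(estimates, truths):
    """Average bias and RMSE over replications (rows) and items (columns)."""
    errors = [
        e - t
        for est_row, true_row in zip(estimates, truths)
        for e, t in zip(est_row, true_row)
    ]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err ** 2 for err in errors) / len(errors))
    return bias, rmse


# Toy example: two replications, two items (not data from the paper).
estimates = [[1.1, 2.1], [0.9, 1.9]]
truths = [[1.0, 2.0], [1.0, 2.0]]
bias, rmse = bias_and_rmse(estimates, truths)
```

In the toy example the positive and negative errors cancel in the bias, while the RMSE still reflects their common magnitude of about 0.1.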


Table 1. Item parameter recovery for RRM and RM

                                 Generating value of $\rho$
Model   Parameter              0.0       0.2       0.4       0.6       0.8
RRM     $\delta_1$     BIAS   -0.0006    0.0018    0.0043    0.0028    0.0042
                       RMSE    0.0336    0.0316    0.0328    0.0330    0.0327
        $\delta_2$     BIAS   -0.0006   -0.0019   -0.0001   -0.0001   -0.0015
                       RMSE    0.0340    0.0316    0.0326    0.0331    0.0333
RM      $\delta^{RM}$  BIAS   -0.0610   -0.0890   -0.1153   -0.1442   -0.1771
                       RMSE    0.0758    0.1021    0.1301    0.1601    0.1947


Note. The calibration of RM is done under the assumption of MAR.

The BIAS for parameters $\delta_1$ and $\delta_2$ is always very close to zero,
while for $\delta^{RM}$ it grows, in absolute value (the estimator seems to be
biased downward), as $\rho$ increases, as suggested by Figure 1. As a
consequence, as can be seen in Table 1, this produces an inflation of the
RMSE for the parameter $\delta^{RM}$; as expected, compared with the RM, the
RRM produced smaller RMSEs under all the considered values of $\rho$.
These results indicate that item parameters can be estimated accurately
under the proposed model, whereas the approach based on the simple RM
ignoring missingness may yield biased estimates when missingness depends
on the latent variable to be measured (i.e. $\theta_2$).


Table 2. Correlation parameter recovery

              Generating value of $\rho$
           0.0       0.2       0.4       0.6       0.8
BIAS    -0.0017   -0.0058   -0.0086   -0.0076   -0.0033
RMSE     0.0175    0.0184    0.0192    0.0154    0.0149


Application to data on racial prejudices 

In this section we describe the application of the RRM to data
analyzed by Knott and Tzamourani (1997) through model (4). As a special
case of the models given in (3), this model is the most similar to the RRM
(5) because:

a) $\theta_2$ represents the latent trait that governs the distinction C versus
B. Then, for both models, $\theta_2$ may be interpreted as the
“ability”;

b) the probability of a nonresponse is explicitly allowed to depend
on both the latent trait dimensions $\theta_1$ and $\theta_2$, for both models.

Figure 1. Bias of the “difficulty” parameter estimates, under RRM ($\delta_2$)
and RM ($\delta^{RM}$): relationship with the correlation parameter $\rho$.


Therefore, for comparative purposes, we decided to calibrate the same 
dataset using the RRM. The dataset considered by Knott and Tzamourani 
(1997) consists of the answers of n = 1408 "white" respondents to the 
following k = 4 items from the British Social Attitudes Survey 1991 
(Brook, Prior & Taylor, 1992): 

Item 1. Thinking of black people - that is, people whose families were 
originally from West Indies or Africa - who now live in Britain. Do you 
think there is a lot of prejudice against them in Britain nowadays, a little, 
or hardly any? 

Item 2. Do you think most white people in Britain would mind or not mind 
if a suitably qualified person of Asian origin were appointed as their 
boss? If "would mind", a lot or a little?

Item 3. And you personally? Would you mind or not? If "would mind", a
lot or a little?

Item 4. Do you think that most white people in Britain would mind or not 
mind if one of their close relatives were to marry a person of Asian 
origin? If "would mind", a lot or a little? 


Table 3. Total scores and CML estimates for the item parameters of the
RRM

                          Item 1     Item 2     Item 3     Item 4
# of given answers          1374       1342       1377       1334
$\hat{\delta}_{1i}$       -0.053      0.322     -1.074      0.805
                         (0.229)    (0.189)    (0.257)    (0.184)
# of "1"                     707        266         72        520
$\hat{\delta}_{2i}$       -1.568      0.317      2.195     -0.944
                         (0.079)    (0.094)    (0.158)    (0.078)


Note. Standard errors of the estimates are also provided in parentheses. 


Knott and Tzamourani (1997) recoded the data as follows: "a lot" and
"mind a lot" were coded as 1, "mind a little", "a little", "hardly any" and
"not mind" were coded as 0, and "don't know" and "not answered" were
coded as missing ($c$, in the notation introduced in this paper). In this way,
$\theta_2$ should measure an attitude towards "non-white" people, or racial
prejudice, while $\theta_1$ should measure, in a broad sense, the tendency to
express an opinion. The primary goal of this case study is to apply the RRM
to the items from the same dataset.

The CML item parameter estimates for the RRM, along with their
standard errors, are given in Table 3. These estimates are obtained in
Mathematica Version 7.0 (Wolfram Research, 2008; the corresponding
code is detailed in the Appendix).

To evaluate the adequacy of the RRM, we could assume that the observed
frequencies over the possible response patterns follow a multinomial
distribution. In this case traditional goodness-of-fit tests could
be applied. Unfortunately, as underlined by Knott and Tzamourani (1997, p.
249) for these data, the number of possible response patterns ($3^4 = 81$) is
substantial; thus, the expected frequencies will tend to be very small for
certain patterns, such that the usual $\chi^2$-approximations are not valid
(see, e.g., Tjur, 1982). For Rasch models, Andersen (1973) suggested an
alternative goodness-of-fit test, based on the CML approach to
estimation. For this goodness-of-fit test, the observed counts for each
response pattern are not required. Instead, the test is based on a partition of
the total sample of subjects into a convenient number $M > 1$ of disjoint
subgroups of subjects, say score groups, with $n_m$ subjects in each
($m = 1, \ldots, M$), "homogeneous" with respect to the joint minimal
sufficient statistics $(d_{v\cdot}, x_{v\cdot})$. The Andersen test is based
on the fact that, under the Rasch paradigm, we should expect the overall
CML-estimates, say $\hat{\delta}$, obtained using the total group of
subjects, to be approximately equal to those, say $\hat{\delta}^{(m)}$,
obtained by maximizing the conditional log-likelihood function $l_c^{(m)}$
corresponding to the score group $m$, for all $m = 1, \ldots, M$. Then, in
order to evaluate the fit, one can compare how close the CML-estimates
$\hat{\delta}^{(m)}$ are to the overall CML-estimates. This can be done
graphically by plotting the score group estimates against the overall
estimates, and numerically by considering a conditional likelihood ratio
test based on the test statistic

    $S = -2\, l_c(\hat{\delta}) + 2 \sum_{m=1}^{M} l_c^{(m)}(\hat{\delta}^{(m)})$.

Under regularity conditions, $S$ is asymptotically $\chi^2$ distributed, with
$\nu = q(M - 1)$ degrees of freedom when each $n_m \to \infty$, where $q$ is
the number of unconstrained item parameters (Andersen, 1973). No precise
theory exists for forming these score groups. Nevertheless, in order to avoid
ill-conditioned datasets (i.e. datasets for which CML estimates do not exist;
see Fischer, 1981), a possible choice is given by $M = 2$ groups, as detailed
in Table 4. Figure 2 shows the score group CML-estimates plotted against the
overall CML-estimates. The benchmark line is also superimposed in Figure 2 to
facilitate comparisons with respect to the optimal situation; it leads us to
conclude that the main structure is good. This graphical result is
corroborated by the observed test statistic $s = 5.576$ which, on $\nu = 6$
degrees of freedom, is not statistically significant (p-value equal to 0.47).
In conclusion, with a more parsimonious RRM, we have obtained an adequate fit
to these data without resorting to the 20-parameter model of Knott and
Tzamourani (1997).
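
As a quick arithmetic check of the reported p-value, the following Python
sketch evaluates the upper tail of the chi-square distribution for the
observed statistic. It relies on the closed form available for an even
number of degrees of freedom, so no statistical library is needed. The
function name chi2_sf_even_df is hypothetical, and the decomposition
q = 2(k - 1) is an assumption (one zero-mean constraint per dimension)
that is consistent with the six degrees of freedom stated in the text.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = (half**j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * sum(terms)

k = 4              # number of items
q = 2 * (k - 1)    # assumed: two dimensions, each with one centering constraint
M = 2              # number of score groups
nu = q * (M - 1)   # degrees of freedom of the Andersen test
s = 5.576          # observed test statistic reported in the text

p_value = chi2_sf_even_df(s, nu)
print(nu, round(p_value, 2))  # 6 0.47
```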

We also illustrate the MML approach by considering the output
provided by ConQuest. Table 5 shows the item parameter estimates and
corresponding fit information. From Table 6 we see that the item parameter
estimates are very similar under CML and MML approaches (by setting to
zero the mean for both the $\delta_1$ and the $\delta_2$ parameters, to
facilitate comparisons).


Table 4. Contingency table for number of given answers and number of
answers in category coded as “1”

                                  # of “1”
# of given answers      0      1      2      3      4    Total
0                       4                                    4
1                       8      6                            14
2                      13      9      3                     25
3                      44     39     11      3              97
4                     379    485    249    129     26     1268
Total                 448    539    263    132     26     1408

Note. Two score groups are identified for the Andersen conditional
likelihood ratio test: score group G1 (composed of $n_1 = 87$ subjects) in
light gray and score group G2 (composed of $n_2 = 1321$ subjects) in dark
gray.

Figure 2. Andersen test. Within-score group CML-estimates against
overall CML-estimates of $\delta_1$ and $\delta_2$ (G1 = score group 1;
G2 = score group 2).


Table 5. MML item parameter estimates for the RRM

(a) $\delta_1$ - summary information

Item    $\hat{\delta}_1$    MNSQ    CI              T
1       -5.368 (0.20)       1.28    (0.68, 1.32)    1.6
2       -4.984 (0.15)       0.99    (0.79, 1.21)    0.0
3       -6.274 (0.21)       0.99    (0.67, 1.33)    0.0
4       -4.486 (0.15)       0.97    (0.81, 1.19)   -0.3

(b) $\delta_2$ - summary information

Item    $\hat{\delta}_2$    MNSQ    CI              T
1       -0.074 (0.06)       1.10    (0.95, 1.05)    3.8
2        1.803 (0.08)       0.92    (0.92, 1.08)   -1.9
3        3.562 (0.13)       0.98    (0.80, 1.20)   -0.2
4        0.595 (0.06)       1.00    (0.95, 1.05)    0.1

Note. Item parameter estimates, as given by the ConQuest output. In this
calibration, the mean of the latent variables is set to zero.

Table 6. Comparison of CML-estimates and MML-estimates of item
parameters for the RRM

        (a) $\delta_1$ estimates           (b) $\delta_2$ estimates
Item      CML       MML            Item      CML       MML
1       -0.053    -0.090           1       -1.568    -1.545
2        0.322     0.294           2        0.317     0.332
3       -1.074    -0.996           3        2.195     2.091
4        0.805     0.792           4       -0.944    -0.876

Note. To facilitate comparisons, the mean of the item parameters on each
dimension is constrained to be zero, for both the estimation methods.
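
The MML column of Table 6 can be recovered from the raw ConQuest estimates
of Table 5 by subtracting, within each dimension, the mean of the four item
parameters. The following Python sketch illustrates this centering and
reports the largest absolute gap between the CML and the centered MML
estimates; the names mml_raw, cml and center are hypothetical.

```python
# Raw ConQuest (MML) estimates copied from Table 5; CML estimates from Table 3.
mml_raw = {
    "delta_1": [-5.368, -4.984, -6.274, -4.486],
    "delta_2": [-0.074, 1.803, 3.562, 0.595],
}
cml = {
    "delta_1": [-0.053, 0.322, -1.074, 0.805],
    "delta_2": [-1.568, 0.317, 2.195, -0.944],
}

def center(values):
    """Subtract the mean so that the item parameters sum to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

mml = {dim: center(vals) for dim, vals in mml_raw.items()}

for dim in mml:
    gaps = [abs(a - b) for a, b in zip(cml[dim], mml[dim])]
    centered = ", ".join(f"{v:.3f}" for v in mml[dim])
    print(f"{dim}: centered MML = [{centered}], max |CML - MML| = {max(gaps):.3f}")
```

The centered values agree with the MML columns of Table 6 up to rounding in
the third decimal place.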


As we can see from Table 5(a) and Table 5(b), ConQuest produces
the mean squared (MNSQ) fit statistic for every estimated parameter, which
is based on a standardized comparison between expected and observed
scores. When the model fits the data, the MNSQ statistics have a unitary
expected value. These statistics are transformed by ConQuest to
approximate normal deviates, denoted by T. The software also provides a
95% confidence interval (CI) for the expected value of the MNSQ. If the
MNSQ fit statistic lies outside the CI, then the corresponding T statistic will
have an absolute value that roughly exceeds 2 (see Wu et al., 2007, p. 23). It
is apparent that all the item fit statistics are good, with the exception of the
parameter $\delta_{21}$. To obtain a simple measure of the global fit of the
RRM we have computed, for each item, the observed and expected frequencies
for each response option: A (nonresponse); B (response in category coded as
“0”); C (response in category coded as “1”). As we can see from Table 7,
the fit seems to be good.
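
The screening rule described above (MNSQ outside its confidence interval,
or |T| larger than about 2) can be applied mechanically to the values of
Table 5. The Python sketch below is illustrative only: the names fit_rows
and misfitting are hypothetical, and the entries 1.28, 1.20 and -0.2 are
read from the table with their decimal points restored.

```python
# Fit rows copied from Table 5: (dimension, item, MNSQ, CI lower, CI upper, T).
fit_rows = [
    ("delta_1", 1, 1.28, 0.68, 1.32, 1.6),
    ("delta_1", 2, 0.99, 0.79, 1.21, 0.0),
    ("delta_1", 3, 0.99, 0.67, 1.33, 0.0),
    ("delta_1", 4, 0.97, 0.81, 1.19, -0.3),
    ("delta_2", 1, 1.10, 0.95, 1.05, 3.8),
    ("delta_2", 2, 0.92, 0.92, 1.08, -1.9),
    ("delta_2", 3, 0.98, 0.80, 1.20, -0.2),
    ("delta_2", 4, 1.00, 0.95, 1.05, 0.1),
]

def misfitting(rows):
    """Return (dimension, item) pairs whose MNSQ lies outside its 95% CI."""
    return [(dim, item) for dim, item, mnsq, lo, hi, _t in rows
            if not lo <= mnsq <= hi]

flagged = misfitting(fit_rows)
large_t = [(dim, item) for dim, item, *_rest, t in fit_rows if abs(t) > 2]
print(flagged, large_t)  # [('delta_2', 1)] [('delta_2', 1)]
```

Both criteria single out the same parameter, the one for item 1 on the
second dimension.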

Table 7. Observed and expected frequencies. A comparison between
model (3) (Knott & Tzamourani, 1997) and the RRM (5).

        (a) Observed          (b) Expected - Model (3)      (c) Expected - RRM
Item    A      B      C       A        B         C          A        B         C
1       34     667    707     33.95    667.03    707.02     33.56    665.84    708.39
2       66     1076   266     65.97    1080.56   261.47     64.39    1075.75   267.66
3       31     1305   72      30.62    1304.87   72.51      29.37    1306.55   71.88
4       74     814    520     74.84    813.46    519.71     73.08    813.11    521.80

Note. Response options: A = nonresponse; B = response in category coded
as “0”; C = response in category coded as “1”.
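
One simple way to summarize Table 7 numerically is the largest absolute
difference between observed and expected frequencies under each model. The
Python sketch below computes it. The summary itself is not part of the
original analysis, the name max_abs_gap is hypothetical, and four RRM
entries (33.56-row value 708.39, then 64.39, 29.37 and 73.08) are taken
with their decimal points restored.

```python
# Frequencies copied from Table 7; each tuple is (A, B, C) for one item.
observed = [(34, 667, 707), (66, 1076, 266), (31, 1305, 72), (74, 814, 520)]
model3 = [(33.95, 667.03, 707.02), (65.97, 1080.56, 261.47),
          (30.62, 1304.87, 72.51), (74.84, 813.46, 519.71)]
rrm = [(33.56, 665.84, 708.39), (64.39, 1075.75, 267.66),
       (29.37, 1306.55, 71.88), (73.08, 813.11, 521.80)]

def max_abs_gap(obs, exp):
    """Largest |observed - expected| over all items and response options."""
    return max(abs(o - e)
               for o_row, e_row in zip(obs, exp)
               for o, e in zip(o_row, e_row))

gap_model3 = max_abs_gap(observed, model3)
gap_rrm = max_abs_gap(observed, rrm)
print(f"model (3): {gap_model3:.2f}, RRM: {gap_rrm:.2f}")  # model (3): 4.56, RRM: 1.80
```

With 1408 respondents, both maxima correspond to a discrepancy of well
under one percent of the sample.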


Interestingly, there is agreement with several of the points made by
Knott and Tzamourani (1997, p. 248). (Note that in their model the
coefficients $\alpha_{1j}$ and $\alpha_{2j}$ play the role of “factor
loadings”.) In particular:

• they find that the discrimination parameter of item 1 ($\alpha_{221}$) "is
close to zero indicating that this item does not "load" as high on the
underlying factor as the other items do. The first item does not seem to
measure racial prejudice. This can be explained by the fact that the
question asked for item 1 is too abstract." Indeed, we also find evidence
of the misfit of this item to the RRM, which is reflected by an
under-discriminating item (large positive value of T);

• they find that "item 2 is the most discriminating item (highest value of
discrimination parameter)." Indeed, for item 2, we found the largest
negative value of T;

• they find that items 2, 3, and 4 correspond to "low probability for a
positive response for the median individual, i.e., the median individual
has a small probability of saying that people would have something
against a "non-white" person." Indeed, the $\delta_2$'s are all positive
for these items.

Conclusions

In this paper we introduce the RRM, a two-dimensional generalization
of the simple RM for the treatment of binary data in the presence of
nonignorable nonresponses. The model is suited for intentional omissions,
i.e. situations in which respondents may decide, for whatever reason, to skip
the item. The model depends on a vector-valued person variable for two
latent dimensions: $\theta_1$, related to the response propensity, and
$\theta_2$, which may be interpreted (depending on the parameterization) as
the usual “ability” parameter. In its simplest form, the RRM combines 1) an
RM for the response variable and 2) an RM for the response indicator
variable, but the model can also be reparameterized via linear combinations
of both the latent traits $\theta_1$ and $\theta_2$, with different
interpretations of the parameters.

In particular, the model presented can cope with situations where the
issues under study are sensitive and a respondent may be embarrassed or
unwilling to reveal his or her opinion, depending upon circumstances (e.g.,
within the context of polling data, see Rubin, Stern, & Vehovar, 1995, and
also Smith, Skinner, & Clarke, 1999). As a member of the exponential family,
the RRM allows the use of conditional inference, but the MML approach
can also be used to estimate the model parameters. More specifically, under
the MML approach the RRM can be seen as an instance of a MRCMLM
with within-item multidimensionality, because each item in the test is
designed to measure both the dimensions $\theta_1$ and $\theta_2$; the
computer program ConQuest can therefore be directly adopted for fitting the
model to data as well. The results of a simulation indicate that item
parameters can be estimated accurately under the proposed model.

APPENDIX

Mathematica code used for computing CML estimates of item 
parameters in the RRM 


(* Import data matrix and preliminary quantities *)
X = Import["X.dat"];
Dim = Dimensions[X];
n = Dim[[1]]; (* number of subjects *)
k = Dim[[2]]; (* number of items *)
c = 9; (* missing value *)
delta = Array[Subscript[\[Delta], ##] &, {2, k}];

(* Response indicator: 1 if the item was answered, 0 if missing *)
d = X;
For[v = 1, v <= n, v++,
  For[i = 1, i <= k, i++,
    d[[v, i]] = If[X[[v, i]] == c, 0, 1]
  ]
];

(* "1"-answer indicator: 1 if the answer is coded 1, 0 otherwise *)
Xd = X;
For[v = 1, v <= n, v++,
  For[i = 1, i <= k, i++,
    Xd[[v, i]] = If[X[[v, i]] == 1, 1, 0]
  ]
];

(* Number of given responses for each subject *)
tot01 = ConstantArray[0, n];
For[v = 1, v <= n, v++,
  tot01[[v]] = Count[d[[v]], 1]
];

(* Number of "1" for each subject *)
tot1 = ConstantArray[0, n];
For[v = 1, v <= n, v++,
  tot1[[v]] = Count[Xd[[v]], 1]
];

(* Single conditional likelihood numerator *)
num = ConstantArray[0, n];
For[v = 1, v <= n, v++,
  num[[v]] = Exp[-(delta[[1]].d[[v]] + delta[[2]].Xd[[v]])]
];

(* All possible 3^k tuples in the denominator *)
Tup = Tuples[{c, 0, 1}, k];

(* One empty list for each pair (given answers, number of "1") *)
For[given = 0, given <= k, given++,
  For[pos = 0, pos <= given, pos++,
    Subscript[a, given, pos] = {}
  ]
];

(* Collect the tuples sharing the same pair of sufficient statistics *)
For[given = 0, given <= k, given++,
  For[pos = 0, pos <= given, pos++,
    For[i = 1, i <= 3^k, i++,
      If[Count[Tup[[i]], c] == k - given && Count[Tup[[i]], 1] == pos,
        Subscript[a, given, pos] = Append[Subscript[a, given, pos], Tup[[i]]]
      ]
    ];
    Print[Subscript[a, given, pos] // MatrixForm]
  ]
];

(* Single denominator *)
den = ConstantArray[0, n];
For[v = 1, v <= n, v++,
  den[[v]] = Sum[
    Exp[-delta[[1]].Boole[(# != c) & /@ Subscript[a, tot01[[v]], tot1[[v]]][[u]]]]*
    Exp[-delta[[2]].Boole[(# == 1) & /@ Subscript[a, tot01[[v]], tot1[[v]]][[u]]]],
    {u, 1, Length[Subscript[a, tot01[[v]], tot1[[v]]]]}
  ]
];

(* Conditional likelihood function *)
LC = Product[num[[v]]/den[[v]], {v, 1, n}];
res = FindMaximum[Log[LC], Union[delta[[1]], delta[[2]]],
  WorkingPrecision -> 20]

(* Identifiability constraints: center the estimates on each dimension *)
deltahat = ConstantArray[0, {2, k}];
deltahat[[1]] = res[[2, 1 ;; k, 2]] - Mean[res[[2, 1 ;; k, 2]]];
deltahat[[2]] = res[[2, k + 1 ;; 2*k, 2]] - Mean[res[[2, k + 1 ;; 2*k, 2]]];
deltahat // MatrixForm

Note. This code is written in Mathematica Version 7.0 (Wolfram Research,
2008). It has been used to obtain the CML-estimates of the item parameters
$\delta_1$ and $\delta_2$ in the application to racial prejudice data and can
be successfully adopted for other short tests (approximately up to k = 10
items). The observed data are supposed to be collected in an (n × k)-matrix
saved (in the working directory of Mathematica) as a plain text file named
X.dat, with tab-separated values coded as c, 0, 1. The code returns a
(2 × k)-matrix, named deltahat, having the CML-estimates of $\delta_1'$ and
$\delta_2'$ in the first and second row, respectively.
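
For readers without Mathematica, the conditional log-likelihood built by
the code above can be sketched in Python as follows. This is a minimal
illustration under the same conventions (missing value coded as 9, weights
of the form exp(-(delta1 . d + delta2 . x)), full enumeration of the 3^k
patterns), not a port of the authors' program. It omits the maximization
step, and the function names are hypothetical.

```python
import math
from itertools import product

MISSING = 9  # same missing code as in the Appendix code (c = 9)

def weight(pattern, delta1, delta2):
    """exp(-(delta1 . d + delta2 . x)) for one response pattern."""
    total = 0.0
    for r, d1, d2 in zip(pattern, delta1, delta2):
        if r != MISSING:
            total += d1  # the item was answered
        if r == 1:
            total += d2  # the answer falls in the category coded "1"
    return math.exp(-total)

def totals(pattern):
    """Sufficient statistics: (number of given answers, number of "1")."""
    given = sum(r != MISSING for r in pattern)
    ones = sum(r == 1 for r in pattern)
    return given, ones

def conditional_loglik(data, delta1, delta2):
    """Conditional log-likelihood, enumerating all 3^k response patterns."""
    k = len(delta1)
    denom = {}
    for pattern in product((MISSING, 0, 1), repeat=k):
        key = totals(pattern)
        denom[key] = denom.get(key, 0.0) + weight(pattern, delta1, delta2)
    return sum(math.log(weight(row, delta1, delta2) / denom[totals(row)])
               for row in data)

# Two made-up respondents on k = 4 items.
data = [[1, 1, 0, 0], [1, MISSING, 0, 0]]
zero = [0.0] * 4
print(conditional_loglik(data, zero, zero))
```

With all item parameters equal to zero, every pattern sharing the same pair
of totals is equally likely, so the two respondents contribute -log 6 and
-log 12. Adding a constant to all the parameters of one dimension leaves
the value unchanged, which is why the estimates are centered at the end of
the Mathematica code.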